Learning Tree Augmented Naive Bayes for Ranking
Authors
Abstract
Naive Bayes has been widely used in data mining as a simple and effective classification algorithm. Since its conditional independence assumption is rarely true, numerous algorithms have been proposed to improve naive Bayes, among which tree augmented naive Bayes (TAN) [3] achieves a significant improvement in terms of classification accuracy while maintaining efficiency and model simplicity. In many real-world data mining applications, however, an accurate ranking is more desirable than a classification. It is therefore interesting to ask whether TAN also achieves a significant improvement in terms of ranking, measured by AUC (the area under the Receiver Operating Characteristics curve) [8, 1]. Unfortunately, our experiments show that TAN performs even worse than naive Bayes in ranking. In response, we present a novel learning algorithm, called forest augmented naive Bayes (FAN), obtained by modifying the traditional TAN learning algorithm. We experimentally test our algorithm on all 36 data sets recommended by Weka [12], and compare it to naive Bayes, SBC [6], TAN [3], and C4.4 [10] in terms of AUC. The experimental results show that our algorithm significantly outperforms all the other algorithms in yielding accurate rankings. Our work provides an effective and efficient data mining algorithm for applications in which an accurate ranking is required.
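As an illustration of the ranking measure named in the abstract, the sketch below computes AUC for a two-class problem from the ranks of the positive examples (the rank-sum, or Mann-Whitney, formulation). This is a standard formulation and only an assumption about how AUC might be computed; the abstract does not say which procedure the authors used, and the function name `auc_from_scores` is hypothetical.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Binary AUC via the rank-sum statistic (hypothetical helper).

    scores: predicted score for the positive class, one per example.
    labels: 1 for positive examples, 0 for negative examples.
    Tied scores receive the average of the ranks they span.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Sort example indices by ascending score, then assign 1-based ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of examples tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Fraction of (positive, negative) pairs ranked correctly; ties count 1/2.
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    print(auc_from_scores([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))
```

The value is the fraction of positive-negative pairs in which the positive example receives the higher score, which is why it rewards a good ranking even when the classification threshold is poorly placed.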
Similar resources
Learning the Tree Augmented Naive Bayes Classifier from incomplete datasets
The Bayesian network formalism is becoming increasingly popular in many areas such as decision aid or diagnosis, in particular thanks to its inference capabilities, even when data are incomplete. For classification tasks, Naive Bayes and Augmented Naive Bayes classifiers have shown excellent performances. Learning a Naive Bayes classifier from incomplete datasets is not difficult as only parame...
Incremental Learning of Tree Augmented Naive Bayes Classifiers
Machine learning has focused a lot of attention on Bayesian classifiers in recent years. It has been seen that, although the Naive Bayes classifier performs well in many cases, it may be improved by introducing some dependency relationships among variables (Augmented Naive Bayes). Naive Bayes is incremental in nature but, up to now, there are no incremental algorithms for learning Augmented classifiers. When...
One Dependence Augmented Naive Bayes
In real-world data mining applications, an accurate ranking is as important as an accurate classification. Naive Bayes (NB) has been widely used in data mining as a simple and effective classification and ranking algorithm. Since its conditional independence assumption is rarely true, numerous algorithms have been proposed to improve Naive Bayes, for example, SBC [1] and TAN [2]. Indeed, ...
A New Hierarchical Redundancy Eliminated Tree Augmented Naive Bayes Classifier for Coping with Gene Ontology-based Features
The Tree Augmented Naïve Bayes classifier is a type of probabilistic graphical model that can represent some feature dependencies. In this work, we propose a Hierarchical Redundancy Eliminated Tree Augmented Naïve Bayes (HRE-TAN) algorithm, which considers removing the hierarchical redundancy during the classifier learning process, when coping with data containing hierarchically structured feat...
Publication date: 2005